How to Calculate a Percentile

Percentiles help us understand how a data value compares to the rest of a dataset by determining what percentage of values fall below it. This page explains how to calculate percentiles manually, interpret their meaning, and use tools like GeoGebra to find percentiles efficiently.

Percentiles By Hand

What is a Percentile?

A percentile is a measure that indicates the relative standing of a data point within a dataset. The \( p^\text{{th}} \) percentile of a dataset is the value below which \( p\% \) of the data falls.

Formula for the Percentile from a Set of Data:

The percentile of a data value \( x \) is given by: \[ \text{{Percentile}} = \left( \dfrac{\text{Number of values less than } x}{\text{Total number of values}} \right) \times 100 \]
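
The formula translates almost directly into code. The Python sketch below is only an illustration: the function name `percentile_rank` and the small sample list are assumptions made for this example and are not part of the lesson's data.

```python
def percentile_rank(values, x):
    """Percentage of the values that are strictly less than x."""
    if not values:
        raise ValueError("values must not be empty")
    # Numerator of the formula: how many values fall below x
    below = sum(1 for v in values if v < x)
    # Denominator is the total number of values; scale to a percentage
    return 100 * below / len(values)


# Hypothetical sample: 3 of the 8 values are less than 20
sample = [5, 10, 15, 20, 20, 25, 30, 35]
rank = percentile_rank(sample, 20)
print(rank)  # 37.5
```

Note that values equal to \( x \) are not counted, which is why the two 20s in the sample do not contribute to the numerator.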

How do I find the data value for a specific percentile?

To determine the value that corresponds to the \( p^\text{{th}} \) percentile in an ordered dataset of size \( n \), follow these steps:

  • Step 1: Compute \( \dfrac{{p}}{{100}} \times n \). This gives the index (position) of the percentile in the ordered dataset.
  • Step 2: Use the result from Step 1 to locate the percentile value:
    • If the result is a decimal, round down to the previous whole number and round up to the next whole number, then take the average of the values in those two positions.
      Example: If you get 39.4, average the values in the \(39^\text{{th}}\) and \(40^\text{{th}}\) positions in the ordered list.
    • If the result is a whole number, take the average of the value in that position and the value in the next position.
      Example: If you get 42, take the average of the values in the \(42^\text{{nd}}\) and \(43^\text{{rd}}\) positions in the ordered list.
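
The two steps above can be sketched in Python as follows. This is a minimal illustration, not a standard library routine: the name `percentile_value` and the sample data are hypothetical, and raising an error when a required position does not exist (for example, \( p = 100 \)) is an assumption, since the steps do not cover that case. Both cases of Step 2 end up using the positions obtained by rounding the Step 1 result down and then adding one, so the code handles them together.

```python
from fractions import Fraction
from math import floor


def percentile_value(values, p):
    """Value at the p-th percentile, following the two-step rule above."""
    ordered = sorted(values)
    n = len(ordered)
    # Step 1: position = (p / 100) * n, computed with fractions so that
    # whole-number results are detected without float round-off
    position = Fraction(p) * n / 100
    # Step 2: a decimal result uses the positions on either side of it;
    # a whole-number result uses that position and the next one.
    # In both cases the pair is floor(position) and floor(position) + 1.
    lower = floor(position)
    upper = lower + 1
    if lower < 1 or upper > n:
        # Assumption: the rule does not define this case, so refuse it
        raise ValueError("percentile falls outside positions 1..n")
    # Positions are 1-based; Python lists are 0-based
    return (ordered[lower - 1] + ordered[upper - 1]) / 2


# Hypothetical data with n = 10
sample = [2, 4, 6, 8, 10, 12, 14, 16, 18, 20]
print(percentile_value(sample, 25))  # position 2.5 -> (4 + 6) / 2 = 5.0
print(percentile_value(sample, 40))  # position 4   -> (8 + 10) / 2 = 9.0
```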

Notation

We often denote the \(k^\text{{th}}\) percentile as \(P_k\).  For example, the \(45^\text{{th}}\) percentile would be denoted \(P_{{45}}\).

 Example 1

The following data represents the number of three-point shots made by 50 randomly selected NBA players who played at least 50 games in a season. 

  • Part A: Find the \(80^\text{{th}}\) percentile of the dataset.
  • Part B: What is the percentile rank of a player who made 180 three-pointers?
Three-Point Shots Made by 50 NBA Players
Number of Three-Point Shots (Ordered)
5 8 12 18 22 28 35 40 48 55
60 68 72 80 85 90 98 105 110 118
125 130 135 140 148 150 158 165 170 175
180 185 190 198 205 210 220 230 240 250
260 270 280 290 300 310 320 330 340 350

Solution

  • Part A: Finding the \(80^\text{{th}}\) Percentile

    The 80th percentile is the value below which 80% of the data falls. We calculate the position in the ordered list where the \(80^\text{{th}}\) percentile would appear: \[ \begin{align*}\text{position in list} &= \dfrac{{p}}{{100}} \times n\\\\&=\dfrac{{80}}{{100}}\times 50\\\\&=40\end{align*} \] Since 40 is a whole number, we take the average of the 40th and 41st values in the ordered dataset. Thus, the 80th percentile is \[ \dfrac{250 + 260}{{2}} = 255\text{ three-point shots.} \]

  • Part B: Finding the Percentile Rank of 180 Three-Pointers

    The percentile rank is found using the formula:\[ \text{{Percentile}} = \left( \dfrac{\text{Number of values less than } x}{{n}} \right) \times 100 \]For \( x = 180 \), there are 30 values below 180 in the dataset. We get that \[ \left( \dfrac{{30}}{{50}} \right) \times 100 = 60 \] 180 is the \(60^\text{{th}}\) percentile, and we denote it as \(P_{{60}}=180\).
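
Both parts can be double-checked with a few lines of Python. The script below is a sketch that simply repeats the hand calculation on the dataset from this example; the variable names are chosen here for illustration.

```python
# Ordered data from Example 1
shots = [
    5, 8, 12, 18, 22, 28, 35, 40, 48, 55,
    60, 68, 72, 80, 85, 90, 98, 105, 110, 118,
    125, 130, 135, 140, 148, 150, 158, 165, 170, 175,
    180, 185, 190, 198, 205, 210, 220, 230, 240, 250,
    260, 270, 280, 290, 300, 310, 320, 330, 340, 350,
]
n = len(shots)

# Part A: position = 80/100 * 50 = 40, a whole number, so average the
# 40th and 41st values (list indices 39 and 40, since lists start at 0)
position = 80 * n // 100
p80 = (shots[position - 1] + shots[position]) / 2

# Part B: percentage of values strictly less than 180
below = sum(1 for s in shots if s < 180)
rank_180 = 100 * below / n

print(position, p80)    # 40 255.0
print(below, rank_180)  # 30 60.0
```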

$$\tag*{\(\blacksquare\)}$$

Percentiles using GeoGebra

 Example 2

We use the same data as in Example 1: the number of three-point shots made by 50 randomly selected NBA players who played at least 50 games in a season.

  • Part A: Find the \(70^\text{{th}}\) percentile of the dataset.
  • Part B: What is the percentile rank of a player who made 180 three-pointers?

Solution

  • Part A:

    Copy the data into the Summary Statistics Calculator and then hide the spreadsheet. Click on the Percentile checkbox, then type 70 into the box that appears and press Enter. The value should update and read \(P_{{70}}=208.5\). Since a player can't make half of a three-point shot, round up to 209. Therefore, the \(70^\text{{th}}\) percentile is 209.

    Summary Statistics Calculator shows that the 70th percentile is 208.5.

    Note: If you need to round, ignore the normal rounding rules and always round up, so that every value below the one GeoGebra reports is still below your rounded value.

  • Part B:

    Copy the data into the Frequency Distribution Tool, and set the options as follows:

    Deselect Organize Data Into Classes, and select Relative Frequencies, Cumulative Frequencies, and Display as Percents.

    A recent change to the tool makes the table moveable.  Move the table until the Raw Data value 180 appears:

    180 has a cumulative relative frequency of 62%.

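    Recall that cumulative relative frequency gives the percentage of values that are less than or equal to 180, but for a percentile we want the percentage that are less than 180.  The value 180 appears once among the 50 values, so its own relative frequency is 2%. Subtracting that 2% from 62% shows that 180 is the \(60^\text{{th}}\) percentile, which matches the result found by hand in Example 1.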

$$\tag*{\(\blacksquare\)}$$

What is the Difference Between Cumulative Relative Frequency and Percentiles?

  • Cumulative relative frequency gives the proportion or percentage of data less than or equal to a number in the dataset.
  • A percentile gives the percentage of data less than a number in the range of the dataset.
    • Here, range means any value between the smallest and largest values in the dataset, not just the numbers that appear in the dataset.
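
The difference comes down to counting with "less than" versus "less than or equal to". The Python sketch below illustrates this on the Example 1 data using the standard `bisect` module; the helper name `rank_and_cumulative` is hypothetical. Unlike a frequency table, the helper also accepts numbers that do not appear in the dataset.

```python
from bisect import bisect_left, bisect_right

# Ordered data from Example 1 (bisect requires a sorted list)
shots = [
    5, 8, 12, 18, 22, 28, 35, 40, 48, 55,
    60, 68, 72, 80, 85, 90, 98, 105, 110, 118,
    125, 130, 135, 140, 148, 150, 158, 165, 170, 175,
    180, 185, 190, 198, 205, 210, 220, 230, 240, 250,
    260, 270, 280, 290, 300, 310, 320, 330, 340, 350,
]


def rank_and_cumulative(x):
    """Return (percent of values < x, percent of values <= x)."""
    n = len(shots)
    less_than = bisect_left(shots, x)   # count of values strictly below x
    at_most = bisect_right(shots, x)    # count of values at or below x
    return 100 * less_than / n, 100 * at_most / n


print(rank_and_cumulative(180))  # (60.0, 62.0): 180 is in the dataset
print(rank_and_cumulative(182))  # (62.0, 62.0): 182 is not in the dataset
```

The two percentages differ only when the number itself appears in the data, and then they differ by exactly the share of the data equal to that number.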

What is the relationship between the Median and Percentiles?

The median is another name for the \(50^\text{{th}}\) percentile: \[\text{{Median}}=P_{{50}}.\] Note that technology often calculates percentiles and medians slightly differently, since the median is typically a calculation for discrete data and a percentile is a calculation for continuous data. When we calculate percentiles for discrete data, we are pretending the data is continuous. For the result to make sense for discrete data, we often have to make minor adjustments, and there isn't exactly one unique way to do this. This leads to only minor differences between the median and \(P_{{50}}\), and these are usually negligible.
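
To see that more than one convention exists, the sketch below uses Python's standard `statistics` module (Python 3.8 or later) on the Example 1 data. These are Python's two built-in interpolation methods, swapped in here purely for illustration; they are not the by-hand rule from this page, and nothing here is meant to describe how GeoGebra works internally.

```python
import statistics

# Ordered data from Example 1
shots = [
    5, 8, 12, 18, 22, 28, 35, 40, 48, 55,
    60, 68, 72, 80, 85, 90, 98, 105, 110, 118,
    125, 130, 135, 140, 148, 150, 158, 165, 170, 175,
    180, 185, 190, 198, 205, 210, 220, 230, 240, 250,
    260, 270, 280, 290, 300, 310, 320, 330, 340, 350,
]

median = statistics.median(shots)

# quantiles(..., n=100) returns the 99 cut points P1 through P99,
# so list index k - 1 holds the k-th percentile
exclusive = statistics.quantiles(shots, n=100, method="exclusive")
inclusive = statistics.quantiles(shots, n=100, method="inclusive")

print(median, exclusive[49], inclusive[49])  # 149.0 149.0 149.0
print(exclusive[69], inclusive[69])          # 208.5 206.5
```

For this dataset the median and both versions of \(P_{{50}}\) agree, but the two methods give different values for \(P_{{70}}\), and the by-hand rule from this page gives a third value, \( (205 + 210)/2 = 207.5 \). The "exclusive" result happens to equal the 208.5 shown in Example 2, although that coincidence alone does not establish which rule GeoGebra applies.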

Interpreting Percentiles

Example 3

Percentiles help compare a data value to the rest of a dataset. For each of the following scenarios, interpret the meaning of the given percentile.

  • Part A: A baby’s weight is in the 85th percentile.
  • Part B: A household income is at the 70th percentile.
  • Part C: A patient’s blood pressure is in the 40th percentile.
  • Part D: A student’s GPA is in the 75th percentile in their school.
  • Part E: A machine produces parts in the 98th percentile for accuracy.

Solution

  • Part A:

    A baby’s weight is in the 85th percentile. This means the baby weighs more than 85% of other babies in the same age group and less than 15% of them.

  • Part B:

    A household income is at the 70th percentile. This means the household earns more than 70% of all households, while 30% of households earn more.

  • Part C:

    A patient’s blood pressure is in the 40th percentile. This means the patient’s blood pressure is higher than 40% of the population and lower than 60% of the population.

  • Part D:

    A student’s GPA is in the 75th percentile in their school. This means the student has a higher GPA than 75% of their classmates, while 25% of students have a higher GPA.

  • Part E:

    A machine produces parts in the 98th percentile for accuracy. This means the machine’s parts are more accurate than 98% of those produced by other machines, with only 2% of machines producing more accurate parts.

$$\tag*{\(\blacksquare\)}$$

Conclusion

By calculating percentiles, we can compare individual data points to a larger dataset. They also give us another tool besides the mean, median, and mode to describe how data is clustered and spread out, and we will explore that more when we study the Five-Number Summary.